82 research outputs found

    Towards a geometrical model for polyrepresentation of information objects

    The principle of polyrepresentation is one of the fundamental recent developments in the field of interactive retrieval. An open problem is how to define a framework which unifies different aspects of polyrepresentation and allows for their application in several ways. Such a framework can be of geometrical nature and it may embrace concepts known from quantum theory. In this short paper, we discuss by giving examples what this framework can look like, with a focus on information objects. We further show how it can be exploited to find a cognitive overlap of different representations on the one hand, and to combine different representations by means of knowledge augmentation on the other hand. We discuss the potential that lies within a geometrical framework and motivate its further development

    Determining the polarity of postings for discussion search

    When performing discussion search it might be desirable to consider non-topical measures like the number of positive and negative replies to a posting, for instance as one possible indicator for the trustworthiness of a comment. Systems like POLAR are able to integrate such values into the retrieval function. To automatically detect the polarity of postings, they need to be classified into positive and negative ones w.r.t. the comment or document they are annotating. We present a machine learning approach for polarity detection which is based on Support Vector Machines. We discuss and identify appropriate term and context features. Experiments with ZDNet News show that an accuracy of around 79%-80% can be achieved for automatically classifying comments according to their polarity

    Exploiting information needs and bibliographics for polyrepresentative document clustering

    In this paper we explore the potential of combining the principle of polyrepresentation with document clustering. Our idea is discussed and evaluated for polyrepresentation of information needs as well as for document-based polyrepresentation where bibliographic information is used as representation. The main idea is to present the user with the highly ranked polyrepresentative clusters to support the search process. Our evaluation suggests that our approach is capable of increasing retrieval performance, but performance varies for queries with a high or low number of relevant documents

    Multi-facet classification of e-mails in a helpdesk scenario

    Helpdesks have to manage a huge amount of support requests which are usually submitted via e-mail. In order to be assigned to experts efficiently, incoming e-mails have to be classified w.r.t. several facets, in particular topic, support type and priority. It is desirable to perform these classifications automatically. We report on experiments using Support Vector Machines and k-Nearest-Neighbours, respectively, for the given multi-facet classification task. The challenge is to define suitable features for each facet. Our results suggest that improvements can be gained for all facets, and they also reveal which features are promising for a particular facet

    Applying Cross-cultural theory to understand users’ preferences on interactive information retrieval platform design

    Presented at EuroHCIR 2014, the 4th European Symposium on Human-Computer Interaction and Information Retrieval, 13th September 2014, at BCS London Office, Covent Garden, London. In this paper we look at using culture to group users and model the users’ preferences in cross-cultural information retrieval, in order to investigate the relationship between user search preferences and the user’s cultural background. Initially we briefly review and discuss website localisation. We continue by examining culture and Hofstede’s cultural dimensions. We identified a link between Hofstede’s five dimensions and user experience. We drew an analogy for each of the five dimensions and developed six hypotheses from the analogies. These hypotheses were then tested by means of a user study. Whilst the key findings from the study suggest cross-cultural theory can be used to model users’ preferences for information retrieval, further work still needs to be done on how cultural dimensions can be applied to inform search interface design

    Combining cognitive and system-oriented approaches for designing IR user interfaces

    Poster at the AIR workshop 2008, London, England

    Scalable DB+IR technology: processing Probabilistic Datalog with HySpirit

    Probabilistic Datalog (PDatalog, proposed in 1995) is a probabilistic variant of Datalog and a nice conceptual idea to model Information Retrieval in a logical, rule-based programming paradigm. Making PDatalog work in real-world applications requires more than probabilistic facts and rules, and the semantics associated with the evaluation of the programs. We report in this paper some of the key features of the HySpirit system required to scale the execution of PDatalog programs. Firstly, there is the requirement to express probability estimation in PDatalog. Secondly, fuzzy-like predicates are required to model vague predicates (e.g. vague match of attributes such as age or price). Thirdly, to handle large data sets there are scalability issues to be addressed, and therefore, HySpirit provides probabilistic relational indexes and parallel and distributed processing. The main contribution of this paper is a consolidated view on the methods of the HySpirit system to make PDatalog applicable in real-scale applications that involve a wide range of requirements typical for data (information) management and analysis

    Identifying the relevance of personal values to e-government portals' success: insights from a Delphi study

    Most governments around the world have put considerable financial resources into the development of e-government systems. They have been making significant efforts to provide information and services online. However, previous research shows that the rate of adoption and success of e-government systems varies significantly across countries. It is argued here that culture can be an important factor affecting e-government success. This paper aims to explore the relevance of personal values to e-government success from an individual user’s perspective. The ten basic values identified by Schwartz were used. A Delphi study was carried out with a group of experts to identify the personal values most relevant to e-government success from an individual’s point of view. The findings suggest that four of the ten values, namely Self-direction, Security, Stimulation, and Tradition, most likely affect success. The findings provide a basis for developing a comprehensive e-government evaluation framework to be validated using a large-scale survey in Saudi Arabia

    Preliminary study of technical terminology for the retrieval of scientific book metadata records

    Books represented only by brief metadata (book records) are particularly hard to retrieve. One way of improving their retrieval is by extracting retrieval-enhancing features from them. This work focusses on scientific (physics) book records. We ask if their technical terminology can be used as a retrieval-enhancing feature. A study of 18,443 book records shows a strong correlation between their technical terminology and their likelihood of relevance. Using this finding for retrieval yields precision and recall gains of more than 5%

    A Probabilistic Framework for Information Modelling and Retrieval Based on User Annotations on Digital Objects

    Annotations are a means to make critical remarks, to explain and comment on things, to add notes and give opinions, and to relate objects. Nowadays, they can be found in digital libraries and collaboratories, for example as a building block for scientific discussion on the one hand or as private notes on the other. We further find them in product reviews, scientific databases and many "Web 2.0" applications; even well-established concepts like emails can be regarded as annotations in a certain sense. Digital annotations can be (textual) comments, markings (i.e. highlighted parts) and references to other documents or document parts. Since annotations convey information which is potentially important to satisfy a user's information need, this thesis tries to answer the question of how to exploit annotations for information retrieval. It gives a first answer to the question of whether retrieval effectiveness can be improved with annotations. A survey of the "annotation universe" reveals some facets of annotations; for example, they can be content level annotations (extending the content of the annotation object) or meta level ones (saying something about the annotated object). Besides the annotations themselves, other objects created during the process of annotation can be interesting for retrieval, these being the annotated fragments. These objects are integrated into an object-oriented model comprising digital objects such as structured documents and annotations as well as fragments. In this model, the different relationships among the various objects are reflected. From this model, the basic data structure for annotation-based retrieval, the structured annotation hypertext, is derived. In order to thoroughly exploit the information contained in structured annotation hypertexts, a probabilistic, object-oriented logical framework called POLAR is introduced.
    In POLAR, structured annotation hypertexts can be modelled by means of probabilistic propositions and four-valued logics. POLAR allows for specifying several relationships among annotations and annotated (sub)parts or fragments. Queries can be posed to extract the knowledge contained in structured annotation hypertexts. POLAR supports annotation-based retrieval, i.e. document and discussion search, by applying an augmentation strategy (knowledge augmentation, propagating propositions from subcontexts like annotations, or relevance augmentation, where retrieval status values are propagated) in conjunction with probabilistic inference, where P(d -> q), the probability that a document d implies a query q, is estimated. POLAR's semantics is based on possible worlds and accessibility relations. It is implemented on top of four-valued probabilistic Datalog. POLAR's core retrieval functionality, knowledge augmentation with probabilistic inference, is evaluated for discussion and document search. The experiments show that all relevant POLAR objects, merged annotation targets, fragments and content annotations, are able to increase retrieval effectiveness when used as a context for discussion or document search. Additional experiments reveal that we can determine the polarity of annotations with an accuracy of around 80%